
Feature: add tests for meps dataset #38

Merged (31 commits, merged into mllam:main on Jun 4, 2024)

Conversation

@SimonKamuk (Contributor) commented May 22, 2024

Implemented tests for loading a reduced-size MEPS example dataset, creating graphs, and training the model.

Work TODO:

  • reduce the number of variables, size of domain, etc. in Joel's MEPS data example so that the zip file is less than 500 MB; calling it meps_example_reduced
  • create the test-data zip file and upload it to EWC (credentials from @leifdenby)
  • implement a test (pytest) that downloads and unpacks the test data using pooch (a minimal sketch follows below)
  • implement testing of:
    • initialization of neural_lam.weather_dataset.WeatherDataset from the downloaded data
    • checking the shapes of the returned parts of a training item
    • creating a new graph in tests for the reduced dataset
    • feeding a single batch through the model and checking the shape of the output
  • add a GitHub Action to run the tests during CI/CD

closes #30
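
A minimal sketch of what the pooch-based test could look like; the URL, the skipped hash check, and the WeatherDataset call are illustrative assumptions, not the exact code merged in this PR:

```python
# Illustrative sketch only: names, URL, and signatures are assumptions.
import pooch
import pytest

from neural_lam.weather_dataset import WeatherDataset

# Hypothetical location; the real archive lives on EWC/S3.
TEST_DATA_URL = "https://example.com/meps_example_reduced.zip"


@pytest.fixture(scope="session")
def meps_example_reduced():
    # pooch downloads the archive once, caches it locally, and the
    # Unzip processor unpacks it, returning the extracted file paths.
    return pooch.retrieve(
        url=TEST_DATA_URL,
        known_hash=None,  # a real test should pin a known hash
        processor=pooch.Unzip(),
    )


def test_training_item_shapes(meps_example_reduced):
    # Assumed constructor signature; the actual WeatherDataset may differ.
    dataset = WeatherDataset(dataset_name="meps_example_reduced")
    init_states, target_states, forcing = dataset[0]

    # Check the shapes of the returned parts of a training item,
    # e.g. (time, num_grid_nodes, num_features).
    assert init_states.ndim == 3
    assert target_states.shape[1] == init_states.shape[1]
    assert forcing.shape[0] == target_states.shape[0]
```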

SimonKamuk and others added 3 commits May 22, 2024 15:36
Also allowed calling train_model.main with arguments (it will still use sys.argv when no arguments are supplied in the function call).
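
A common pattern for supporting both entry points (a sketch of the idea, not necessarily the exact commit contents): argparse falls back to sys.argv[1:] when parse_args receives None.

```python
import argparse


def main(input_args=None):
    parser = argparse.ArgumentParser(description="Train model")
    parser.add_argument("--epochs", type=int, default=1)
    # parse_args(None) reads sys.argv[1:], so a plain `main()` keeps the
    # CLI behaviour, while tests can call e.g. main(["--epochs", "1"]).
    args = parser.parse_args(input_args)
    return args


if __name__ == "__main__":
    main()
```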
@SimonKamuk SimonKamuk added this to the v0.2.0 milestone May 22, 2024
@SimonKamuk SimonKamuk marked this pull request as ready for review May 27, 2024 12:18
@SimonKamuk SimonKamuk requested a review from leifdenby May 27, 2024 12:19
@leifdenby (Member) left a comment


Looks great to me! It will be really nice to have this automated testing on a real dataset. What do you think about creating a file called DEVELOPING.md in the repo root where you detail the contents of the test data a bit more?

@sadamov could you give this a read through too please?

(Inline review threads on: tests/test_mllam_dataset.py, neural_lam/vis.py, neural_lam/utils.py, requirements.txt; all resolved)
 - removed linting dependencies
 - minor changes to test file
 - added notebook outlining generation of meps_example_reduced from meps_example (the kind of subsetting involved is sketched below)
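
The notebook itself is not reproduced here; a rough sketch of the kind of subsetting it performs, where the paths, array layout, and indices are hypothetical, assuming the fields are stored as NumPy arrays:

```python
import numpy as np

# Hypothetical layout: (time, y, x, variable); the real files may differ.
full = np.load("data/meps_example/samples/train/nwp_sample.npy")

var_subset = [0, 1, 2]              # keep only a few variables
cropped = full[:, :100, :100]       # shrink the spatial domain
reduced = cropped[..., var_subset]  # drop the remaining variables

np.save("data/meps_example_reduced/samples/train/nwp_sample.npy", reduced)
```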
@SimonKamuk (Contributor, Author)

> Looks great to me! It will be really nice to have this automated testing on a real dataset. What do you think about creating a file called DEVELOPING.md in the repo root where you detail the contents of the test data a bit more?
>
> @sadamov could you give this a read through too please?

I added this as a notebook instead, since I could then include the code directly. Let me know if you would still prefer just an md file :)

@sadamov (Collaborator) commented May 29, 2024

> > Looks great to me! It will be really nice to have this automated testing on a real dataset. What do you think about creating a file called DEVELOPING.md in the repo root where you detail the contents of the test data a bit more?
> >
> > @sadamov could you give this a read through too please?
>
> I added this as a notebook instead, since I could then include the code directly. Let me know if you would still prefer just an md file :)

Could we move this and all future notebooks to a neural-lam/notebooks folder? Then I would also give this notebook a more descriptive name, like create_reduced_meps_dataset.ipynb.

@sadamov (Collaborator) left a comment


Very nice @SimonKamuk, we definitely needed tests! Well done 🚀
I tested this with a push to mllam-repo and all tests ran smoothly.

Otherwise, I only requested some smaller changes. Thanks again Simon!

(Inline review threads on: DEVELOPING.ipynb, tests/test_mllam_dataset.py, .github/workflows/run_tests.yml; all resolved)
@leifdenby (Member) left a comment


I only have two minor comments about comments you could add to the code. Otherwise this is good to go.

Remember to update the changelog to say that you have added testing of dataset loading, graph creation and training based on the reduced MEPS dataset stored on AWS S3! This is a big step!!

Also, I think it would be nice to add CI/CD badges for the testing to the top of the README, if @sadamov and @joeloskarsson are OK with that?

(Inline review threads on: DEVELOPING.ipynb, tests/test_mllam_dataset.py; all resolved)
@joeloskarsson (Collaborator) left a comment


This is very valuable to have! I had some questions about the added dependencies in requirements.txt, and also some follow-ups to earlier discussions that might be good to take a look at.

(Inline review threads on: requirements.txt, neural_lam/utils.py, DEVELOPING.ipynb; all resolved)
@joeloskarsson (Collaborator) commented May 29, 2024

> I only have two minor comments about comments you could add to the code. Otherwise this is good to go.
>
> Remember to update the changelog to say that you have added testing of dataset loading, graph creation and training based on the reduced MEPS dataset stored on AWS S3! This is a big step!!
>
> Also, I think it would be nice to add CI/CD badges for the testing to the top of the README, if @sadamov and @joeloskarsson are OK with that?

I have no opinion on CI/CD badges, so please go ahead if you'd like.

@leifdenby (Member) left a comment


I think we just need to remove pytorch-geometric from the requirements.txt and then this is good to go!

@joeloskarsson (Collaborator) left a comment


All looking good now I think. I have only read the code, but trust that others have tested that everything runs. Great work on this!

@leifdenby (Member) left a comment


Ready!

@SimonKamuk (Contributor, Author)

@sadamov Do you have any further requests or are we ready to merge? :)

@sadamov (Collaborator) commented Jun 3, 2024

Ready to merge! 🚀

@sadamov sadamov self-requested a review June 3, 2024 16:32
@SimonKamuk SimonKamuk merged commit 81d0840 into mllam:main Jun 4, 2024
8 checks passed
@joeloskarsson (Collaborator)

@SimonKamuk at the moment I am seeing "Unit Tests: Failing" on the badge, even though this is not the case for the main branch. I was looking at https://github.com/mllam/neural-lam/actions/workflows/run_tests.yml and it seems the failures were on some other branch. I really don't know much about how these badges work, so I'm not sure how it is supposed to function 😅 Do you know if there is a way to tweak it so that it only applies to the main branch? Or would that require large changes to the way the GitHub workflows are set up?

@SimonKamuk (Contributor, Author)

> @SimonKamuk at the moment I am seeing "Unit Tests: Failing" on the badge, even though this is not the case for the main branch. I was looking at https://github.com/mllam/neural-lam/actions/workflows/run_tests.yml and it seems the failures were on some other branch. I really don't know much about how these badges work, so I'm not sure how it is supposed to function 😅 Do you know if there is a way to tweak it so that it only applies to the main branch? Or would that require large changes to the way the GitHub workflows are set up?

Yeah, I noticed that as well. I could specify that the badge should only refer to the main branch, but the issue is that the workflows were not running on pushes to main, so there were no runs for the badge to refer to. In the most recent commit I made these changes (sketched below), but I am not sure what the implications of no longer ignoring pushes to main are (why were they ignored in the first place, @leifdenby?).
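
Roughly, the change amounts to something like this in the workflow triggers (a sketch, not the exact contents of the file in the repo), combined with pinning the README badge to main via the badge URL's ?branch=main query parameter:

```yaml
# .github/workflows/run_tests.yml, trigger section only (sketch)
on:
  push:
    branches: [main]   # run on pushes to main so the badge has runs to show
  pull_request:
    branches: [main]
```

The badge URL would then be https://github.com/mllam/neural-lam/actions/workflows/run_tests.yml/badge.svg?branch=main.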

@joeloskarsson (Collaborator)

Ok, that's great! I guess running on pushes to main is generally unnecessary, since we run everything before merging. However, I don't think it matters if we run it an extra time just for the badge.
